Querying and analysing document collections with Formal Concept Analysis
Abstract
Formal Concept Analysis (FCA) has been applied to the task of document retrieval in many different ways. In this paper we present a new document management tool, based on FCA and aimed at facilitating the retrieval of documents and the understanding of the structure of collections of standard documents such as PDF, HTML or Word files. The user interface is designed to allow easy access to the tool without prior knowledge of FCA.
Similar resources
Approximate Tree Embedding for Querying XML Data
Querying heterogeneous collections of data-centric XML documents requires a combination of database languages and concepts used in information retrieval, in particular similarity search and ranking. In this paper we present an approach to find approximate answers to formal user queries. We reduce the problem of answering queries against XML document collections to the well-known unordered tree ...
Automatic keyword extraction using Latent Dirichlet Allocation topic modeling: Similarity with golden standard and users' evaluation
Purpose: This study investigates the automatic keyword extraction from the table of contents of Persian e-books in the field of science using LDA topic modeling, evaluating their similarity with golden standard, and users' viewpoints of the model keywords. Methodology: This is a mixed text-mining research in which LDA topic modeling is used to extract keywords from the table of contents of sci...
Using pattern structures to support information retrieval with Formal Concept Analysis
In this paper we introduce a novel approach to information retrieval (IR) based on Formal Concept Analysis (FCA). The use of concept lattices to support the task of document retrieval in IR has proven effective since they allow querying in the space of terms modelled by concept intents and navigation in the space of documents modelled by concept extents. However, current approaches use binary r...
Document Retrieval for E-Mail Search and Discovery Using Formal Concept Analysis
This paper discusses a document discovery tool based on conceptual clustering by formal concept analysis. The program allows users to navigate email using a visual lattice metaphor rather than a tree. It implements a virtual file structure over email where files and entire directories can appear in multiple positions. The content and shape of the lattice formed by the conceptual ontology can assis...
Incremental Development of Browsing for Domain-Specific Document Retrieval Systems
Browsing is supported in many information retrieval systems to supplement Boolean querying. We have implemented a web-based browsing mechanism for a domain-specific document retrieval system based on the concept lattice of Formal Concept Analysis. In this paper, we propose and implement an incremental development of browsing by combining Formal Concept Analysis (FCA) and Ripple Down...